DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks
Despite a rapid rise in the quality of built-in smartphone cameras, their
physical limitations (small sensor size, compact lenses and the lack of
specific hardware) prevent them from achieving the quality of DSLR
cameras. In this work we present an end-to-end deep learning approach that
bridges this gap by translating ordinary photos into DSLR-quality images. We
propose learning the translation function using a residual convolutional neural
network that improves both color rendition and image sharpness. Since the
standard mean squared loss is not well suited for measuring perceptual image
quality, we introduce a composite perceptual error function that combines
content, color and texture losses. The first two losses are defined
analytically, while the texture loss is learned in an adversarial fashion. We
also present DPED, a large-scale dataset that consists of real photos captured
from three different phones and one high-end reflex camera. Our quantitative
and qualitative assessments reveal that the enhanced image quality is
comparable to that of DSLR-taken photos, while the methodology generalizes
to any type of digital camera.
WESPE: Weakly Supervised Photo Enhancer for Digital Cameras
Low-end and compact mobile cameras demonstrate limited photo quality mainly
due to space, hardware and budget constraints. In this work, we propose a deep
learning solution that translates photos taken by cameras with limited
capabilities into DSLR-quality photos automatically. We tackle this problem by
introducing a weakly supervised photo enhancer (WESPE) - a novel image-to-image
Generative Adversarial Network-based architecture. The proposed model is
trained under weak supervision: unlike previous works, there is no need for
strong supervision in the form of a large annotated dataset of aligned
original/enhanced photo pairs. The sole requirement is two distinct datasets:
one from the source camera, and one composed of arbitrary high-quality images
that can be generally crawled from the Internet - the visual content they
exhibit may be unrelated. Hence, our solution is repeatable for any camera:
collecting the data and training can be achieved in a couple of hours. In this
work, we emphasize extensive evaluation of the obtained results. Besides
standard objective metrics and a subjective user study, we train a virtual rater
in the form of a separate CNN that mimics human raters on Flickr data and use
this network to get reference scores for both original and enhanced photos. Our
experiments on the DPED, KITTI and Cityscapes datasets as well as pictures from
several generations of smartphones demonstrate that WESPE produces qualitative
results comparable to or better than those of state-of-the-art strongly
supervised methods.
Matching features correctly through semantic understanding
© 2014 IEEE. Image-to-image feature matching is the single most restrictive time bottleneck in any matching pipeline. We propose two methods for improving the speed and quality by employing semantic scene segmentation. First, we introduce a way of capturing semantic scene context of a keypoint into a compact description. Second, we propose to learn correct matchability of descriptors from these semantic contexts. Finally, we further reduce the complexity of matching to only a pre-computed set of semantically close keypoints. All methods can be used independently and in the evaluation we show combinations for maximum speed benefits. Overall, our proposed methods outperform all baselines and provide significant improvements in accuracy and an order of magnitude faster keypoint matching. Kobyshev N., Riemenschneider H., Van Gool L., "Matching features correctly through semantic understanding", 2nd international conference on 3D Vision - 3DV 2014, pp. 472-479, December 8-11, 2014, Tokyo, Japan.
Architectural decomposition for 3D landmark building understanding
© 2016 IEEE. Decomposing 3D building models into architectural elements is an essential step in understanding their 3D structure. Although we focus on landmark buildings, our approach generalizes to arbitrary 3D objects. We formulate the decomposition as a multi-label optimization that identifies individual elements of a landmark. This allows our system to cope with noisy, incomplete, outlier-contaminated 3D point clouds. We detect three types of structural cues, namely dominant mirror symmetries, rotational symmetries, and polylines capturing free-form shapes of the landmark not explained by symmetry. Combining these cues enables modeling the variability present in complex 3D models, and robustly decomposing them into architectural structural elements. Our architectural decomposition facilitates significant 3D model compression and shape-specific modeling. Kobyshev N., Riemenschneider H., Bódis-Szomorú A., Van Gool L., "Architectural decomposition for 3D landmark building understanding", IEEE winter conference on applications of computer vision - WACV 2016, 10 pp., March 7-9, 2016, Lake Placid, NY, USA.
Efficient architectural structural element decomposition
© 2016 Elsevier Inc. Decomposing 3D building models into architectural elements is an essential step in understanding their 3D structure. Although we focus on landmark buildings, our approach generalizes to arbitrary 3D objects. We formulate the decomposition as a multi-label optimization that identifies individual elements of a landmark. This allows our system to cope with noisy, incomplete, outlier-contaminated 3D point clouds. We detect four types of structural cues, namely dominant mirror symmetries, rotational symmetries, shape primitives, and polylines capturing free-form shapes of the landmark not explained by symmetry. Our novel method combines these cues, which enables modeling the variability present in complex 3D models and robustly decomposing them into architectural structural elements. Our proposed architectural decomposition facilitates significant 3D model compression and shape-specific modeling. Kobyshev N., Riemenschneider H., Bódis-Szomorú A., Van Gool L., "Efficient architectural structural element decomposition", Computer vision and image understanding, vol. 157, pp. 300-312, April 2017.
3D saliency for finding landmark buildings
© 2016 IEEE. In urban environments the most interesting and effective factors for localization and navigation are landmark buildings. This paper proposes a novel method to detect such buildings that stand out, i.e. would be given the status of 'landmark'. The method works in a fully unsupervised way, i.e. it can be applied to different cities without requiring annotation. First, salient points are detected, based on the analysis of their features as well as those found in their spatial neighborhood. Second, learning refines the points by finding connected landmark components and training a classifier to distinguish these from common building components. Third, landmark components are aggregated into complete landmark buildings. Experiments on city-scale point clouds show the viability and efficiency of our approach on various tasks. Kobyshev N., Riemenschneider H., Bódis-Szomorú A., Van Gool L., "3D saliency for finding landmark buildings", 4th international conference on 3D Vision - 3DV 2016, pp. 267-275, October 25-28, 2016, Stanford, California, USA.